Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms
We consider stochastic multi-armed bandits where the expected reward is a
unimodal function over partially ordered arms. This important class of problems
has been recently investigated in (Cope 2009, Yu 2011). The set of arms is
either discrete, in which case arms correspond to the vertices of a finite
graph whose structure represents similarity in rewards, or continuous, in which
case arms belong to a bounded interval. For discrete unimodal bandits, we
derive asymptotic lower bounds for the regret achieved under any algorithm, and
propose OSUB, an algorithm whose regret matches this lower bound. Our algorithm
optimally exploits the unimodal structure of the problem, and surprisingly, its
asymptotic regret does not depend on the number of arms. We also provide a
regret upper bound for OSUB in non-stationary environments where the expected
rewards smoothly evolve over time. The analytical results are supported by
numerical experiments showing that OSUB performs significantly better than the
state-of-the-art algorithms. For continuous sets of arms, we provide a brief
discussion. We show that combining an appropriate discretization of the set of
arms with the UCB algorithm yields an order-optimal regret, and in practice,
outperforms recently proposed algorithms designed to exploit the unimodal
structure.
Comment: ICML 2014 (technical report). arXiv admin note: text overlap with
arXiv:1307.730
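The continuous-arm idea in the abstract above (discretize the interval, then run UCB on the grid) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the classical UCB1 index, a uniform grid of 10 midpoints on [0, 1], Bernoulli rewards, and the hypothetical names `ucb1_on_grid` and `triangle`. The paper's "appropriate discretization" is not reproduced.

```python
import math
import random

def ucb1_on_grid(reward_mean, num_arms, horizon, seed=0):
    """Run classical UCB1 on a uniform grid of `num_arms` points in [0, 1].

    `reward_mean(x)` must return a success probability in [0, 1];
    rewards are simulated as Bernoulli draws (an illustrative assumption).
    """
    rng = random.Random(seed)
    # Grid of cell midpoints: a simple, assumed discretization.
    arms = [(k + 0.5) / num_arms for k in range(num_arms)]
    counts = [0] * num_arms
    sums = [0.0] * num_arms
    for t in range(1, horizon + 1):
        if t <= num_arms:
            k = t - 1  # initialization: play every arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            k = max(
                range(num_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < reward_mean(arms[k]) else 0.0
        counts[k] += 1
        sums[k] += reward
    return arms, counts

def triangle(x):
    """Hypothetical unimodal mean reward with its peak at x = 0.65."""
    return 0.9 - 1.5 * abs(x - 0.65)

arms, counts = ucb1_on_grid(triangle, num_arms=10, horizon=20000)
most_played = arms[max(range(len(arms)), key=lambda i: counts[i])]
print("most played arm:", most_played)
```

In this sketch the grid size trades discretization error against the number of arms to explore; the fixed value of 10 is arbitrary and would normally be tuned to the horizon.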
Dynamic Rate and Channel Selection in Cognitive Radio Systems
In this paper, we investigate dynamic channel and rate selection in cognitive
radio systems which exploit a large number of channels free from primary users.
In such systems, transmitters may rapidly change the selected (channel, rate)
pair to opportunistically learn and track the pair offering the highest
throughput. We formulate the problem of sequential channel and rate selection
as an online optimization problem, and show its equivalence to a structured
Multi-Armed Bandit problem. The structure stems from inherent
properties of the achieved throughput as a function of the selected channel and
rate. We derive fundamental performance limits satisfied by any channel
and rate adaptation algorithm, and propose algorithms that achieve (or
approach) these limits. In turn, the proposed algorithms optimally exploit the
inherent structure of the throughput. We illustrate the efficiency of our
algorithms using both test-bed and simulation experiments, in both stationary
and non-stationary radio environments. In stationary environments, the packet
successful transmission probabilities at the various channel and rate pairs do
not evolve over time, whereas in non-stationary environments, they may evolve.
In practical scenarios, the proposed algorithms are able to track the best
channel and rate quite accurately without the need for any explicit measurement
and feedback of the quality of the various channels.
Comment: 19 page
Mixed Polling with Rerouting and Applications
Queueing systems with a single server in which customers wait to be served at
a finite number of distinct locations (buffers/queues) are called discrete
polling systems. Polling systems in which arrivals of users occur anywhere in a
continuum are called continuous polling systems. Often one encounters a
combination of the two systems: the users can either arrive in a continuum or
wait in a finite set (i.e. wait at a finite number of queues). We call these
systems mixed polling systems. Also, in some applications, customers are
rerouted to a new location (for another service) after their service is
completed. In this work, we study mixed polling systems with rerouting. We
obtain their steady state performance by discretization using the known pseudo
conservation laws of discrete polling systems. Their stationary expected
workload is obtained as a limit of the stationary expected workload of a
discrete system. The main tools for our analysis are: (a) the fixed point
analysis of infinite dimensional operators; and (b) the convergence of Riemann
sums to an integral.
We analyze two applications using our results on mixed polling systems and
discuss the optimal system design. We consider a local area network, in which a
moving ferry facilitates communication (data transfer) using a wireless link.
We also consider a distributed waste collection system and derive the optimal
collection point. In both examples, the service requests can arrive anywhere in
a subset of the two dimensional plane. Namely, some users arrive in a
continuous set while others wait for their service in a finite set. The only
polling systems that can model these applications are mixed systems with
rerouting as introduced in this manuscript.
Comment: to appear in Performance Evaluatio
Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms
We consider stochastic multi-armed bandit problems where the expected reward
is a Lipschitz function of the arm, and where the set of arms is either
discrete or continuous. For discrete Lipschitz bandits, we derive asymptotic,
problem-specific lower bounds for the regret satisfied by any algorithm, and
propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitz
structure of the problem. In fact, we prove that OSLB is asymptotically
optimal, as its asymptotic regret matches the lower bound. The regret analysis
of our algorithms relies on a new concentration inequality for weighted sums of
KL divergences between the empirical distributions of rewards and their true
distributions. For continuous Lipschitz bandits, we propose to first discretize
the action space, and then apply OSLB or CKL-UCB, algorithms that provably
exploit the structure efficiently. This approach is shown, through numerical
experiments, to significantly outperform existing algorithms that directly deal
with the continuous set of arms. Finally, the results and algorithms are
extended to contextual bandits with similarities.
Comment: COLT 201
Hierarchical Beamforming: Resource Allocation, Fairness and Flow Level Performance
We consider hierarchical beamforming in wireless networks. For a given
population of flows, we propose computationally efficient algorithms for fair
rate allocation including proportional fairness and max-min fairness. We next
propose closed-form formulas for flow level performance, for both elastic (with
either proportional fairness or max-min fairness) and streaming traffic. We
further assess the performance of hierarchical beamforming using numerical
experiments. Since the proposed solutions have low complexity compared to
conventional beamforming, our work suggests that hierarchical beamforming is a
promising candidate for the implementation of beamforming in future cellular
networks.
Comment: 34 page
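As a reminder of what max-min fairness means in the abstract above, the sketch below applies the standard progressive-filling (water-filling) procedure to a single shared resource. This is an assumed toy setting, not the paper's hierarchical beamforming algorithm: `capacity`, `demands`, and the function name `max_min_share` are hypothetical, and each flow is simply capped at its own demand.

```python
def max_min_share(capacity, demands):
    """Max-min fair split of `capacity` among flows capped at `demands`.

    Progressive filling: flows with small demands are fully served first,
    and the leftover capacity is shared equally among the remaining flows.
    """
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Process flows from the smallest demand to the largest.
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    while active:
        share = remaining / len(active)
        i = active[0]
        if demands[i] <= share:
            # This flow needs less than an equal share: serve it fully.
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            active.pop(0)
        else:
            # Every remaining flow is bottlenecked: split equally and stop.
            for j in active:
                alloc[j] = share
            break
    return alloc

print(max_min_share(10, [2, 8, 8]))
```

With a capacity of 10 and demands 2, 8 and 8, the small flow is served in full and the other two split the remaining 8 equally.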
Multipath streaming: fundamental limits and efficient algorithms
We investigate streaming over multiple links. A file is split into small
units called chunks that may be requested on the various links according to
some policy, and received after some random delay. After a start-up time called
pre-buffering time, received chunks are played at a fixed speed. There is
starvation if the chunk to be played has not yet arrived. We provide lower
bounds (fundamental limits) on the starvation probability of any policy. We
further propose simple, order-optimal policies that require no feedback. For
general delay distributions, we provide tractable upper bounds for the
starvation probability of the proposed policies, which allow the pre-buffering
time to be selected appropriately. We specialize our results to: (i) links that
employ CSMA or opportunistic scheduling at the packet level, (ii) links shared
with a primary user, and (iii) links that use fair rate sharing at the flow level.
We consider a generic model so that our results give insight into the design
and performance of media streaming over (a) wired networks with several paths
between the source and destination, (b) wireless networks featuring spectrum
aggregation and (c) multi-homed wireless networks.
Comment: 24 page
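The starvation event described in the abstract above can be estimated by simulation. The sketch below is a rough illustration under assumptions that are not from the paper: chunks are assigned to links round-robin with no feedback, each link delivers its chunks one after another with independent exponential delays, and chunk i is due for playback at time `prebuffer + i * play_interval`. The function name and all parameters are hypothetical.

```python
import random

def starvation_probability(num_chunks, num_links, prebuffer, play_interval,
                           mean_delay, runs=2000, seed=0):
    """Monte Carlo estimate of the probability that playback ever starves."""
    rng = random.Random(seed)
    starved_runs = 0
    for _ in range(runs):
        # Time at which each link finishes its current chunk.
        link_free = [0.0] * num_links
        for i in range(num_chunks):
            link = i % num_links  # static round-robin assignment (assumed policy)
            # The chunk arrives after an exponential delay once the link is free.
            arrival = link_free[link] + rng.expovariate(1.0 / mean_delay)
            link_free[link] = arrival
            # Starvation: the chunk is not there when it must be played.
            if arrival > prebuffer + i * play_interval:
                starved_runs += 1
                break
    return starved_runs / runs

p_no_buffer = starvation_probability(50, 2, prebuffer=0.0,
                                     play_interval=1.0, mean_delay=1.5)
p_some_buffer = starvation_probability(50, 2, prebuffer=5.0,
                                       play_interval=1.0, mean_delay=1.5)
p_huge_buffer = starvation_probability(50, 2, prebuffer=1e9,
                                       play_interval=1.0, mean_delay=1.5)
print(p_no_buffer, p_some_buffer, p_huge_buffer)
```

Sweeping `prebuffer` in such a simulation gives an empirical counterpart to the analytical bounds mentioned in the abstract, which serve the same purpose of choosing the pre-buffering time.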